Rewrite audio DSP plan around fingerprint-first architecture by abossard · Pull Request #3 · abossard/qlcplus

abossard · 2026-05-05T22:46:04Z

Replace the prior plan with a three-library pipeline: Essentia (offline
features), Olaf (acoustic fingerprinting), aubio (live fallback), backed by
a single SQLite database for tracks, features, fingerprints, and profiles.

Key changes:

- Add live track recognition via Olaf so cached features replay in sync
  with whatever the DJ is playing, without loopback.
- AudioFeatures becomes the single view consumed by scripts, widgets, and
  MCP tools, populated either by LiveAudioAnalyzer or CachedAudioAnalyzer.
- VCAudioTrigger is rewritten as the audio control center with a library
  browser, recognition badge, drop/build/key indicators, and the existing
  envelope/AGC/trigger/spectral panels.
- Drop all backwards-compatibility paths: ledfx_compat.js, audio_common.js,
  legacy per-bar triggers, BeatTracker, and old AudioParams DSP fields are
  scheduled for deletion in M7.
- Sequence the work fingerprint-first: M1 proves Olaf can lock onto EDM
  through DJ EQ and pitch shift before any further engine work.
- Accept AGPL-3.0 for the combined binary when Essentia is linked; provide
  a -Daudio_essentia=OFF build flag for downstream redistributors.

Adds milestones M0-M8, updated decisions DD1-DD18, SQLite schema,
AudioFeatures struct, and an FMA/Jamendo CC test corpus plan.

Adapt the architecture after rubberducking with research and after the direction "live features first, best possible, low latency":

- Live AudioAnalyzer is M1, shippable on its own through M5. Cached features (M6), Olaf identification (M7), chromagram tracking (M8), and Tier-1 DJ protocols (M9) extend the same AudioFeatures view incrementally.
- Olaf is no longer used for continuous lock. It runs one-shot on a rolling ~5 s buffer to identify the track and seed initial position. This avoids Olaf's known brittleness past ~3% time-stretch since identification needs only one good match, not continuous lock.
- New PositionSource abstraction with three tiers: DJ-software protocols (OS2L beat counter + cached beat grid, Pro DJ Link, StagelinQ), chromagram cross-correlation against cached chroma with a small speed search, and aubio + internal clock fallback. Highest-priority confident-and-fresh tier wins; per-source latency offsets calibrated against onsets.
- SQLite schema adds a `chroma` table holding 12-bin chroma at ~10 Hz per track for the chromagram tracker.
- Live latency target codified: <10 ms input-to-onset, <1 ms shared analyzer budget, <0.5 ms per AudioChannel; no heap allocation per frame; lock-free SPSC ring for snapshots.
- AudioIdentifier is an interface with a Panako backend ready as a build option for environments where DJ pitch-bend during the ID window matters.
- VCAudioTrigger ships its live panels in M4; library browser, recognition badge, drop/build/key indicators, and position-source picker are added incrementally in M6-M9 in the same chrome.
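The selection rule for PositionSource ("highest-priority confident-and-fresh tier wins") could be sketched roughly as below. This is an illustration only, not code from the PR: the names `Tier`, `PositionEstimate`, `ArbiterConfig`, and `pickPosition`, the threshold values, and the way the latency offset is applied are all assumptions. Only the three tiers and the rule itself come from the plan.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical tier ordering: a lower value means a higher priority.
enum class Tier : std::uint8_t { DjProtocol = 0, Chromagram = 1, AubioClock = 2 };

// Hypothetical estimate reported by one position source.
struct PositionEstimate {
    Tier   tier;
    double positionSec;       // estimated playback position within the track
    double confidence;        // 0..1, as reported by the source
    double ageSec;            // time since the source last produced an update
    double latencyOffsetSec;  // per-source offset, assumed calibrated elsewhere
};

// Assumed thresholds; the plan does not state concrete values.
struct ArbiterConfig {
    double minConfidence = 0.6;
    double maxAgeSec     = 0.5;
};

// Returns the usable estimate from the highest-priority tier, with its
// latency offset added, or std::nullopt when no source is usable.
inline std::optional<PositionEstimate>
pickPosition(const std::vector<PositionEstimate>& candidates,
             const ArbiterConfig& cfg = ArbiterConfig{})
{
    std::optional<PositionEstimate> best;
    for (const PositionEstimate& c : candidates) {
        const bool usable = c.confidence >= cfg.minConfidence
                         && c.ageSec <= cfg.maxAgeSec;
        if (!usable)
            continue;  // stale or low-confidence sources never win
        if (!best || c.tier < best->tier)
            best = c;
    }
    if (best)
        best->positionSec += best->latencyOffsetSec;
    return best;
}
```

With this shape, a stale DJ-protocol reading is skipped in favour of a fresh chromagram estimate, and the aubio clock is used only when nothing better is usable.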

abossard · 2026-05-05T22:46:42Z

@copilot please finish the live implementation

Agent-Logs-Url: https://github.com/abossard/qlcplus/sessions/e4afee83-41dd-431b-ac2f-5dc5cc839725 Co-authored-by: abossard <86611+abossard@users.noreply.github.com>

Copilot · 2026-05-05T22:55:26Z

@copilot please finish the live implementation

Implemented the first live audio path: AudioCapture now publishes a fixed AudioFeatures snapshot with live RMS/peak/crest, 32-band normalized spectrum, perceptual bands, spectral shape/flux/onset, and beat/BPM state via LiveAudioAnalyzer. Addressed in 03bc4ae.
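The plan calls for a lock-free SPSC ring to hand fixed-size snapshots from the capture thread to consumers without per-frame heap allocation. A minimal sketch of such a ring follows. It is not the code in 03bc4ae: `SpscRing` and `FeatureSnapshot` are hypothetical names, and the snapshot fields are only loosely modelled on the features listed above.

```cpp
#include <array>
#include <atomic>
#include <cstddef>

// Hypothetical fixed-size snapshot; field names are assumptions.
struct FeatureSnapshot {
    float rms = 0.0f;
    float peak = 0.0f;
    float crest = 0.0f;
    std::array<float, 32> spectrum{};  // 32 normalized bands
    float bpm = 0.0f;
    bool  beat = false;
};

// Single-producer / single-consumer ring with a fixed capacity.
// Storage is preallocated, so push() and pop() never touch the heap.
template <typename T, std::size_t N>
class SpscRing {
    static_assert(N >= 2 && (N & (N - 1)) == 0, "N must be a power of two");
public:
    // Call from the producer thread only. Returns false when full.
    bool push(const T& value) {
        const std::size_t head = m_head.load(std::memory_order_relaxed);
        const std::size_t tail = m_tail.load(std::memory_order_acquire);
        if (head - tail == N)
            return false;
        m_buf[head & (N - 1)] = value;
        m_head.store(head + 1, std::memory_order_release);
        return true;
    }

    // Call from the consumer thread only. Returns false when empty.
    bool pop(T& out) {
        const std::size_t tail = m_tail.load(std::memory_order_relaxed);
        const std::size_t head = m_head.load(std::memory_order_acquire);
        if (head == tail)
            return false;
        out = m_buf[tail & (N - 1)];
        m_tail.store(tail + 1, std::memory_order_release);
        return true;
    }

private:
    std::array<T, N> m_buf{};
    std::atomic<std::size_t> m_head{0};  // written by the producer only
    std::atomic<std::size_t> m_tail{0};  // written by the consumer only
};
```

The release store on one index paired with the acquire load on the other is what makes the copied snapshot visible to the opposite thread before the index moves. A capture thread would `push` one `FeatureSnapshot` per analysis frame, and a UI or script thread would drain the ring with `pop`.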

Agent-Logs-Url: https://github.com/abossard/qlcplus/sessions/f5a26a26-1240-49f7-a73b-4b9c0d29c0ce Co-authored-by: abossard <86611+abossard@users.noreply.github.com>

abossard · 2026-05-05T23:49:17Z

@codex[agent] finish the implementation with tests and actual mo3 files

chatgpt-codex-connector · 2026-05-05T23:49:24Z

To use Codex here, create a Codex account and connect to GitHub.

Co-authored-by: abossard <86611+abossard@users.noreply.github.com>

Codex · 2026-05-05T23:55:26Z

@codex[agent] finish the implementation with tests and actual mo3 files

Added LiveAudioAnalyzer unit tests plus deterministic M0–M3 test vectors under engine/test/liveaudioanalyzer/data/, and fixed silence frames to force beat/BPM back to zero in AudioCapture::run(). Commit: 9ba147a.
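The silence fix can be pictured as a small level gate: measure RMS, peak, and crest factor for a frame, and clear the beat/BPM state when the frame is effectively silent. The sketch below is illustrative and is not the code in 9ba147a. `LevelStats`, `BeatState`, `measureLevels`, `applySilenceGate`, and the `1e-4` threshold are all assumptions.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>

// Hypothetical per-frame level measurements.
struct LevelStats {
    float rms;
    float peak;
    float crest;  // peak / rms, or 0 for a silent frame
};

// Hypothetical beat-tracking state exposed to consumers.
struct BeatState {
    bool  beat;
    float bpm;
};

// Computes RMS, absolute peak, and crest factor over n mono samples.
inline LevelStats measureLevels(const float* samples, std::size_t n) {
    if (samples == nullptr || n == 0)
        return {0.0f, 0.0f, 0.0f};
    double sumSquares = 0.0;
    float peak = 0.0f;
    for (std::size_t i = 0; i < n; ++i) {
        const float s = samples[i];
        sumSquares += static_cast<double>(s) * s;
        peak = std::max(peak, std::fabs(s));
    }
    const float rms = static_cast<float>(std::sqrt(sumSquares / static_cast<double>(n)));
    const float crest = rms > 0.0f ? peak / rms : 0.0f;
    return {rms, peak, crest};
}

// Forces beat/BPM back to zero when the frame is below the silence
// threshold; the default threshold is an assumed value, not from the PR.
inline void applySilenceGate(const LevelStats& levels, BeatState& state,
                             float silenceRms = 1e-4f) {
    if (levels.rms < silenceRms) {
        state.beat = false;
        state.bpm = 0.0f;
    }
}
```

Without a gate of this kind, a tracker that keeps its last tempo estimate would continue to report a BPM after the music stops.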

claude added 2 commits May 5, 2026 21:37

Copilot started work on behalf of abossard May 5, 2026 22:46

Copilot AI and others added 3 commits May 5, 2026 22:51

Add live audio feature analyzer scaffold

87c8657

Agent-Logs-Url: https://github.com/abossard/qlcplus/sessions/e4afee83-41dd-431b-ac2f-5dc5cc839725 Co-authored-by: abossard <86611+abossard@users.noreply.github.com>

Address live analyzer review feedback

a5dc19f

Agent-Logs-Url: https://github.com/abossard/qlcplus/sessions/e4afee83-41dd-431b-ac2f-5dc5cc839725 Co-authored-by: abossard <86611+abossard@users.noreply.github.com>

Tighten live analyzer follow-up

03bc4ae

Agent-Logs-Url: https://github.com/abossard/qlcplus/sessions/e4afee83-41dd-431b-ac2f-5dc5cc839725 Co-authored-by: abossard <86611+abossard@users.noreply.github.com>

Copilot finished work on behalf of abossard May 5, 2026 22:56

Copilot started work on behalf of abossard May 5, 2026 23:26

Document M0-M5 verification gaps

86cdf22

Agent-Logs-Url: https://github.com/abossard/qlcplus/sessions/f5a26a26-1240-49f7-a73b-4b9c0d29c0ce Co-authored-by: abossard <86611+abossard@users.noreply.github.com>

Copilot finished work on behalf of abossard May 5, 2026 23:38

Codex started work on behalf of abossard May 5, 2026 23:49

Add LiveAudioAnalyzer unit tests and vectors

9ba147a

Co-authored-by: abossard <86611+abossard@users.noreply.github.com>

Codex finished work on behalf of abossard May 5, 2026 23:56

abossard wants to merge 7 commits into mcp-server from claude/research-audio-analysis-N5WGL

4 participants
